Combination of standard and throat microphones for robust speech recognition in highly noisy environments
Authors
Abstract
We present a method to combine standard and throat microphone signals for noise-robust speech recognition. Our approach is to extend the probabilistic optimum filter (POF) mapping algorithm to estimate standard microphone clean speech feature vectors from both microphones’ noisy speech feature vectors. We tested the proposed approach in two noisy speech recognition tasks. In the first task, we used a large-vocabulary continuous speech recognition system and noisy speech with either artificially added noise or noise recorded in an M1 tank cockpit. In the second task, we used a real-time system and noisy speech recorded in a highly noisy environment, inside a HMMWV military vehicle. A noise-canceling microphone and a throat microphone were used in this task. Because of the highly adverse conditions in this second task, we propose an extension of the combined microphone approach that takes into account the level of noise captured by the throat microphone. The combined microphone approach significantly outperforms the single microphone approach in all the recognition experiments.
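As a rough illustration of the mapping idea in the abstract, the sketch below assumes the usual piecewise-linear reading of POF: the feature space is divided into regions, each region has its own affine transform, and the clean-feature estimate is the posterior-weighted sum of the per-region outputs. The function names, the spherical-Gaussian soft assignment, and the toy numbers are all assumptions made for illustration; the abstract does not specify the region model, the training procedure, or whether windows of neighboring frames are used.

```python
import math

def region_posteriors(z, centers, var=1.0):
    """Soft-assign vector z to regions (assumed: spherical Gaussians, equal priors)."""
    scores = [-sum((a - b) ** 2 for a, b in zip(z, c)) / (2.0 * var)
              for c in centers]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def affine(W, b, y):
    """Apply one region's affine transform: W @ y + b."""
    return [sum(w * v for w, v in zip(row, y)) + bias
            for row, bias in zip(W, b)]

def pof_estimate(std_feat, throat_feat, centers, transforms, var=1.0):
    """Estimate a clean standard-microphone feature vector from both microphones.

    centers:    one center per region, in the combined feature space
    transforms: one (W, b) pair per region (hypothetical, assumed already trained)
    """
    y = list(std_feat) + list(throat_feat)  # combine both noisy feature vectors
    post = region_posteriors(y, centers, var)
    dim = len(transforms[0][1])
    estimate = [0.0] * dim
    for p, (W, b) in zip(post, transforms):
        out = affine(W, b, y)
        for k in range(dim):
            estimate[k] += p * out[k]
    return estimate

# Toy example: a single region whose transform averages the two microphones.
W = [[0.5, 0.0, 0.5, 0.0],
     [0.0, 0.5, 0.0, 0.5]]
b = [0.0, 0.0]
print(pof_estimate([1.0, 2.0], [3.0, 4.0], [[0.0] * 4], [(W, b)]))
```

In a real system the transforms would be trained on simultaneously recorded clean and noisy data; this sketch only covers the estimation step under the stated assumptions.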
Related articles
An Information-Theoretic Discussion of Convolutional Bottleneck Features for Robust Speech Recognition
Convolutional Neural Networks (CNNs) have shown their effectiveness in speech recognition systems for both feature extraction and acoustic modeling. In addition, CNNs have been used for robust speech recognition, and competitive results have been reported. A Convolutive Bottleneck Network (CBN) is a kind of CNN that has a bottleneck layer among its fully connected layers. The bottleneck fea...
Improving the performance of MFCC for Persian robust speech recognition
The Mel Frequency cepstral coefficients are the most widely used feature in speech recognition, but they are very sensitive to noise. In this paper, to achieve satisfactory performance in Automatic Speech Recognition (ASR) applications, we introduce a new noise-robust set of MFCC vectors estimated through the following steps. First, spectral mean normalization is a pre-processing which applies to t...
Speech Emotion Recognition Based on Power Normalized Cepstral Coefficients in Noisy Conditions
In recent years, automatic recognition of emotional states from speech in noisy conditions has become an important research topic in emotional speech recognition. This paper considers the recognition of emotional states via speech in real environments. For this task, we employ the power normalized cepstral coefficients (PNCC) in a speech emotion recognition system. We investigate its perfor...
Speech intelligibility in noise using throat and acoustic microphones.
INTRODUCTION: Helicopter cockpits are very noisy and this noise must be reduced for effective communication. The standard U.S. Army aviation helmet is equipped with a noise-canceling acoustic microphone, but some ambient noise still is transmitted. Throat microphones are not sensitive to air molecule vibrations and thus, transmittal of ambient noise is reduced. It is possible that throat microph...
A spatio-temporal speech enhancement scheme for robust speech recognition in noisy environments
A new speech enhancement scheme is presented integrating spatial and temporal signal processing methods for robust speech recognition in noisy environments. The scheme first separates spatially localized point sources from noisy speech signals recorded by two microphones. Blind source separation algorithms assuming no a priori knowledge about the sources involved are applied in this spatial pro...